Predicting hydrological behavior in basins made up of high Andean
ecosystems is very difficult: these basins combine a variety of climates,
complex geology, highly varied topography, and soils rich in organic
matter that support a very heterogeneous vegetation cover. The scarcity
of hydrometric information in their river networks adds great uncertainty
when planning the use of water resources. The predominant approach to
prediction relies on hydrological models that relate precipitation and
runoff, which require historical information that is not available in most
cases. Artificial neural networks, in contrast, offer a methodology that
adapts to the information available in each basin to analyze the
relationship between precipitation and runoff and, because of their
robustness, can yield accurate results. This research aimed to estimate
and predict the average monthly flows of the Crisnejas river basin,
located in the northern region of the Peruvian Andes, for which historical
records from 12 meteorological stations and one hydrometric station were
available. Flow, precipitation, temperature, and normalized difference
vegetation index (NDVI) data were used to train a multilayer
perceptron-type artificial neural network, which achieved a goodness of
fit of 81 % in the coefficient of determination. The flow record generated
in this way was then used to train a second, recurrent network to predict
monthly mean flows over eight years, with a goodness of fit of 71 %.


Keywords: monthly flows, artificial neural networks, monthly flow
prediction.
Introduction


Estimating water supply is a recurring problem in hydrology when no
adequate flow record exists for a basin. For this purpose, theoretical
models are available that are based on the interrelation of the variables
of the water cycle and on the processes that determine the amount of
water available at a point of interest. The information available in a
basin, and the information each model requires, determine which model
can be applied in each case; sometimes, therefore, simplifications or
assumptions must be made regarding the variables or the hydrological
cycle. The selection of variables and the amount of data determine the
model's predictive capacity (Cabrera, 2012).


Torres and Granados (2019) note that hydrological analysis has
traditionally relied on the availability of climatological and
hydrological information in a basin. Together with the analysis of
geographical, geological, and environmental conditions, this information
allows the simulation of diverse natural phenomena such as droughts,
floods, flash floods, and water supply availability, among others, which
are essential inputs for integrated water management. However, classical
modeling protocols cannot be applied when a basin lacks instrumentation,
and hydrologists then face the problem of quantifying water resources
indirectly, sometimes with little scientific support. Researchers such as
Alipour and Kibler (2019) and Choubin et al. (2019) agree that reliable
flow estimation, especially in ungauged basins, is of utmost importance
for environmental management and planning, and that flow prediction in
ungauged basins is necessary to support decisions on the best use of
water.


Sivapalan and Wagener, cited by Hrachowitz et al. (2013), indicate
that at the beginning of the new millennium the community had come to
recognize that hydrological theories, models, and empirical methods were
largely inadequate for predictions in ungauged basins. Furthermore, there
was a need to better understand the links between hydrological function
(how a watershed responds to inputs) and form (the physical properties of
a watershed) in order to address the challenge of ungauged watersheds
adequately.


Therefore, in recent decades there has been a need for new
methodologies capable of improving the precision of flow predictions in
ungauged basins. Alipour and Kibler (2019) present the Streamflow
Prediction under Extreme Data-scarcity (SPED) model, a framework designed
for streamflow prediction in regions with sparse hydrometeorological
observations. Razavi and Coulibaly (2016) and Choubin et al. (2019)
propose considering the integral characteristics of watersheds through
multi-model approaches to improve the continuous estimation of daily flow
in ungauged watersheds through regionalization, the process of
transferring hydrological information from gauged to ungauged watersheds.
Many researchers now include digital elevation models to improve the
approximation in the calculations; this is the case of Althoff, Ribeiro,
and Neiva-Rodrigues (2021), who present a methodology based on the
Terrain Analysis Using Digital Elevation Models (TauDEM) toolset to
obtain the input variables for the regionalization model, averaged over
the catchment area of each pixel in the flow network grid.


Hrachowitz et al. (2013), in their review “A decade of Predictions in
Ungauged Basins (PUB): a review”, found that the main factors
contributing to predictive uncertainty, as identified by the PUB
initiative, include: a) an incomplete understanding of the set of
processes that underlie the response of the hydrological system, and of
the feedbacks between these processes at the catchment scale, which
frequently results in inherently unrealistic models with high predictive
uncertainty; b) an incomplete understanding of the multi-scale
spatio-temporal heterogeneity of processes in different landscapes and
climates, as the vast majority of small catchments around the world were,
and still are, ungauged, with little or no information available; and
c) inadequate regionalization techniques for transferring understanding
of hydrological response patterns from gauged to ungauged environments,
due to a lack of cross-basin comparative studies and of understanding of
the physical principles that govern sound regionalization.


In small basins, in cases where little data are available, or for
specific precipitation events, the direct relationship between rainfall
and runoff can be determined using regression methods (Osborn, 1969),
deriving equations that relate the flow to rainfall and/or other
variables (USACE, 1971). Compared to hydrological models, these
techniques are more flexible in terms of the information required,
although they involve more assumptions and no known interrelation between
the variables involved in the process. Furthermore, by the nature of the
method, extrapolation is limited, non-linear relationships cannot be
captured without transforming the inputs, and the results are sensitive
to outliers.


In contrast to the flow estimation techniques and models described
above, artificial neural networks (hereinafter ANNs) have several
advantages: it is not necessary to know the physical relationship between
the variables involved in the problem; they are robust (they are not
highly sensitive to errors in the input patterns); the input variables
can be adapted to the available data (Delgado, 1998); and, depending on
the type of ANN, they can be applied to recurrent processes to make time
series forecasts. Herrera, Leiva, and Romero (2020) note that in
hydrology there are many cases in which neural networks have been used to
predict the behavior of a variable from previous historical data and a
set of predictor variables; their own research addressed the particular
problem of reconstructing missing information from meteorological
stations using ANNs.


In recent decades, the use of neural networks in hydrological
modeling has increased due to their fundamental property as universal and
parsimonious approximators of non-linear functions. In flood forecasting,
feedforward and recurrent multilayer perceptrons have confirmed their
efficiency (Darras, Johannet, Vayssade, Kong-A-Siou, & Pistre, 2018).
Sustainable water resources management requires short-term flow
forecasts, a hydrological challenge that Steyn (2018) and Lama and
Sánchez (2020) propose to address with machine learning techniques, both
to deal with discontinuities in the data and to work with flows that
follow non-linear or stationary behaviors. Brenes (2020) focuses more
specifically on predicting the hourly average flow using machine learning
models based on decision trees, comparing their predictive capacity at
the Palmar hydrological station, located on the Grande de Térraba river
in the South Pacific region of Costa Rica.


For Heras and Matovelle (2021), computational methods based on
machine learning have been widely developed and applied in hydrology,
especially for modeling systems that lack sufficient data. Within this
problem, series with missing data should not necessarily be discarded;
they can instead be completed, with the understanding that this requires
combining approaches or methodologies. Some investigations along these
lines have had satisfactory results, such as that of Canchala,
Alfonso-Morales, Carvajal-Escobar, Cerón, and Caicedo-Bravo (2020), who
evaluated the performance of combining three artificial neural network
(ANN) approaches to forecast monthly rainfall anomalies in southwestern
Colombia, or that of Farfán, Palacios, Ulloa, and Avilés (2020), who
propose a hybrid technique that uses the time series generated by
individual models as inputs to a new ANN. This approach aims to increase
the precision of the simulated flow by combining and exploiting the
information provided by physical and data-driven models.


For the Crisnejas River, located in northern Peru, there is a monthly
flow record of 13 years in two periods separated by a data gap of 37
years; however, complete precipitation and temperature records are
available at many weather stations in and around the basin. This
situation is common in basins of the Peruvian coast and highlands, which
are of great interest for implementing hydraulic projects that require
knowledge of water availability. The short record prevents an adequate
probabilistic estimation of flow persistence, and for this reason the
record must be completed based on rainfall-runoff relationships (ANA,
2015). In Peru, hydrological models for monthly flows are frequently
applied for this purpose, such as the lumped model of Témez for basins
smaller than 10,000 km² (Témez, 1977), or the model developed by Lutz
Scholz within the framework of the Technical Cooperation of the Republic
of Germany for the Meris II Plan, which applies only to basins in the
Peruvian highlands (Scholz, 1980).


The aforementioned hydrological models require the precipitation data
from the stations to be reduced to a single average record for the basin,
which eliminates variability. The same occurs with temperature, and the
Témez model additionally requires an estimate of the average potential
evapotranspiration (ETP) in the basin; since sufficient data are not
always available, one must resort to ETP estimation models based on
temperature. During calibration and validation of these models, absurd
values can be obtained for parameters such as aquifer discharge, delay,
or runoff coefficients, either because the model is not applicable to the
basin of interest or simply because of deficiencies in the input data.


In contrast to estimating monthly flows with the aforementioned
hydrological models, ANNs do not eliminate the variability of the
precipitation data from the different climatic stations but instead
establish their influence on the output implicitly, or internally. The
same applies to the temperature data or to any additional variables
considered in the analysis. Furthermore, calibration is unnecessary since
the ANN seeks to “learn” how to relate the inputs so as to reach the
output with the least possible error (Delgado, 1998). This provides
considerable flexibility regarding the input information and the quality
of the results.


Therefore, this research aims to apply artificial neural networks
(ANNs) to estimate the missing flow data of the Crisnejas river from data
on precipitation, temperature, and vegetation cover, the latter
quantified by the Normalized Difference Vegetation Index (NDVI) of an
average year.


Background 


Artificial neural networks are a computational technique inspired by
the biological neuron model and threshold logic proposed by Warren
McCulloch and Walter Pitts in 1943. The principle of the perceptron was
established in 1958, with the limitation that it could only solve
linearly separable problems; it was not until 1975 that the
backpropagation algorithm was proposed and this limitation was overcome
(Delgado, 1998).


Studies that directly apply artificial neural networks (ANNs) to
solve complex hydrological problems have become increasingly frequent,
given the large number of computational tools developed in recent years
for training ANNs of different types and with different algorithms. A
summary of the works that precede this research is presented below.


The ASCE Task Committee on Application of Artificial Neural Networks
in Hydrology (2000), writing in the Journal of Hydrologic Engineering,
compiles the possible applications of ANNs in various branches of
hydrology, such as rainfall-runoff, streamflow, groundwater, water
quality, and precipitation. It indicates that, with adequate training,
ANNs can generate satisfactory results for prediction problems in
hydrology.


Dawson and Wilby (2001) propose a protocol for implementing
artificial neural networks in precipitation-runoff processes and flood
prediction, which includes normalizing or standardizing the data to the
range accepted by the activation function.


Kalteh (2008) carries out precipitation-runoff modeling with ANNs
using precipitation, temperature, flow, and time data. His research
concluded that ANNs estimate flow with reasonable precision; in addition,
he points to them as promising tools not only for their precision but
also for the relationship they learn, since he used neural interpretation
methods to interpret the connection weights of the network.


In his research, Laqui (2010) uses precipitation, evapotranspiration,
and flow data of the Huancané River (Peru) to train a multilayer
perceptron-type ANN with the backpropagation algorithm and compares its
results with a stochastic time series model, obtaining a better fit with
the ANN.


Herrera-Quispe, Yari, Luque, and Tupac (2013) also used a multilayer
perceptron ANN with the Levenberg-Marquardt algorithm to generate
stochastic monthly flows in the Chili River basin (Peru) in combination
with the Thomas-Fiering stochastic model.


Gomes-Villa-Trinidad (2016) presents, in his master's thesis, a
model for predicting monthly inflows using ANNs in the Amambaí river
basin (Brazil). His conclusions showed that ten hidden neurons gave
better results than networks of 15 to 25 neurons, and that ANNs are a
very efficient alternative for flow prediction compared with the naive
(trivial) prediction model. This research also compiles the methodology
proposed by Dawson and Wilby (2001) as a protocol for implementing
precipitation-runoff models with ANNs. The study likewise used a
multilayer perceptron-type ANN with the Levenberg-Marquardt algorithm.


Materials and methods


Methodological proposal


To determine the historical monthly flow for the period 1965-2017 and
predict it over eight years, the training of two artificial neural
networks is proposed: a multilayer perceptron (MLP) and a recurrent NAR
(nonlinear autoregressive) network.


For the first network, the training patterns have the following data
as input:


• Precipitation. Recorded monthly (1965-2017) at 12 meteorological
stations in the study area (within the basin boundary and its
surroundings).

• Temperature. Recorded (1965-2017) at 5 of the 12 stations above.

• Ground cover. Quantified with the NDVI, obtained from multispectral
images for each month of an average hydrological year.

• Flow rates. Short monthly record (1965-1976, 2014-2019), used in
three parts: one for training the multilayer perceptron network
(1968-1976, 2016), another for validating that network (2014, 2015,
2017), and a shorter period (2018-2019) for validating the prediction
made with the recurrent NAR network.


The diagram in Figure 1 shows the process followed for training and
prediction with both networks.


Figure 1. Methodology for estimating flows.


For the second network (RNN NAR), only the flow data estimated with
the MLP ANN are required.


Many tools are currently available for training artificial neural
networks, from programming languages such as Python or R to programs
with a graphical interface such as MATLAB. In this case, the MLP-type ANN
was trained by coding the backpropagation algorithm in the VB.net
language. For the NAR-type RNN, the 'Time Series app' for artificial
neural networks in MATLAB 2015 was used.


Hydrological balance 


The proposed methodology for estimating the monthly flow (m³/s) in
the Crisnejas river basin is based on the most influential variables in
the basin's water balance. According to Fattorelli and Fernández (2007),
the hydrological model of a basin is based on the processes that make up
the phases of the hydrological cycle. In a basin, the variables can be
classified into inputs (precipitation), outputs (runoff, groundwater
flow, evapotranspiration), and storage variation. All these variables are
interrelated, as shown in Equation (1):

$$\Delta S = P - Q - G - ET \tag{1}$$


Where:
ΔS = change in storage, in mm/year over the basin area
P = precipitation, in mm/year over the basin area
Q = flow, in mm/year over the basin area
G = groundwater flow out of the basin, in mm/year over the basin area
ET = evapotranspiration, in mm/year over the basin area


Analyzing each of the variables shows that knowledge of
precipitation is essential for estimating the flow; in this case, it is
considered independent of other factors and is measured data already
included in the ANN input pattern.


Groundwater flow depends on the cover, the soil type, and the
geology. The last two are considered constant at the monthly time scale
and over the whole period analyzed (53 years); therefore, the parameter
to be quantified is the cover. This parameter is quantified through the
NDVI, or Normalized Difference Vegetation Index, according to Huete and
Tucker (1991), for an average year.


Evapotranspiration, according to Allen, Pereira, Raes, and Smith
(2006), is the combination of two separate processes by which water is
lost: evaporation from the soil surface and transpiration from
vegetation. There are many equations and methods for its estimation; in
this research, its simplest conceptualization has been adopted.
Thornthwaite (1948) proposes Equation (2), which gives an estimate of the
ETP in mm/month:


$$ETP = 16\left(\frac{10\,T}{I}\right)^{a} \tag{2}$$

Where:
T = temperature in °C.
I = annual heat index, which is a function of the monthly temperature.
a = parameter that is a function of I.
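As an illustration of Equation (2), the following minimal Python sketch computes the unadjusted Thornthwaite ETP for the twelve months of a year. The expressions used for I and a are the standard ones from Thornthwaite's method and are not spelled out in the text above, so they are an assumption here; the temperature values are hypothetical, and no correction for day length or days per month is applied.

```python
import numpy as np

def thornthwaite_etp(monthly_temp_c):
    """Unadjusted Thornthwaite ETP for 12 mean monthly temperatures (deg C)."""
    # Negative temperatures are set to zero so they add nothing to the heat index.
    t = np.clip(np.asarray(monthly_temp_c, dtype=float), 0.0, None)
    # Annual heat index I (standard Thornthwaite expression, assumed here).
    heat_index = np.sum((t / 5.0) ** 1.514)
    # Exponent a as a cubic function of I (standard expression, assumed here).
    a = (6.75e-7 * heat_index**3 - 7.71e-5 * heat_index**2
         + 1.792e-2 * heat_index + 0.49239)
    # Equation (2): ETP = 16 * (10 T / I) ** a
    return 16.0 * (10.0 * t / heat_index) ** a

# Hypothetical mean monthly temperatures for one station (deg C).
temps = [14.5, 14.3, 14.2, 14.0, 13.6, 13.0, 12.7, 13.1, 13.8, 14.1, 14.4, 14.6]
etp = thornthwaite_etp(temps)
print(np.round(etp, 1))
```

Because I and a depend only on temperature, the sketch shows why temperature alone can stand in for ETP in the network's input pattern.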


In this way, potential evapotranspiration does not need to be entered
directly into the model, since it can be expressed as a function of
temperature. Its representation is also expected to improve with the NDVI
since, in reality, it also depends on the basin's cover.


Storage is related to complex processes in which cover, soil type,
geology, infrastructure, and relief must be considered. Its variability
is not significant over the period and time scale of this investigation;
therefore, it is treated as a constant.


Finally, the conceptual model proposed to estimate the monthly flow
is based on precipitation (P), temperature (T), and NDVI:

$$Q = f(P, T, NDVI) \tag{3}$$
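The reasoning that leads from the water balance to this conceptual model can be written out step by step. The sketch below is only a restatement of the assumptions given in the text (groundwater flow governed by cover, evapotranspiration by temperature and cover, storage change treated as a constant); the symbol c for the constant storage term is introduced here for illustration.

```latex
\begin{aligned}
Q &= P - G - ET - \Delta S && \text{(Equation 1 solved for } Q\text{)} \\
G &\approx G(NDVI), \qquad ET \approx ET(T, NDVI), \qquad \Delta S \approx c \\
\Rightarrow\; Q &\approx P - G(NDVI) - ET(T, NDVI) - c \;=\; f(P, T, NDVI)
\end{aligned}
```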


Multilayer perceptron network and backpropagation algorithm


According to Isasi-Viñuela and Galván-León (2004), unlike the simple
perceptron, the multilayer perceptron can solve problems that are not
linearly separable. This type of network includes one or more hidden
layers, which make it possible to form more complex decision regions. The
multilayer perceptron, or MLP, is usually trained with the
backpropagation algorithm, which is why it is also known as a
backpropagation network.


ANNs of the multilayer perceptron type (Figure 2) are composed of an
input layer, one or more intermediate or hidden layers, and an output
layer. Each neuron in one layer connects with all the neurons in the
following layer. Information propagates in one direction: once it is
presented to the ANN at the input layer, it reaches the output layer
through the hidden layers. This process is called feedforward.


Figure 2. Multilayer perceptron.


2023, Instituto Mexicano de Tecnología del Agua. Open Access under the
CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/)
Tecnología y ciencias del agua, ISSN 2007-2422,
14(1), 124-199. DOI: 10.24850/j-tyca-14-01-04


Each neuron receives a linear combination (weighted sum) of its
inputs, each multiplied by a so-called "weight", and this sum is then
evaluated by the "activation function", whose result becomes the input to
the next layer, as expressed in Equation (4), according to Delgado
(1998). The weights are adjusted through the training process, for which
there are algorithms such as backpropagation, combined with error
minimization techniques such as gradient descent, Levenberg-Marquardt,
Newton, or conjugate gradient:

$$y = g\left(\sum_{i} W_i\, x_i + W_0\right) \tag{4}$$
Where:
y = neuron output.
g = activation function; it can be hyperbolic tangent, logistic,
identity, ReLU, Gaussian, or of other types.
x = inputs.
W = weights.
W0 = activation threshold.
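The neuron equation extends naturally to a whole layer by stacking the weights of its neurons into a matrix. The following Python sketch shows one possible feedforward pass under that reading; the layer sizes, the random weights, and the function names are illustrative assumptions and do not reproduce the trained network of this study.

```python
import numpy as np

def layer_forward(x, W, w0, g=np.tanh):
    """Equation (4) applied to every neuron of a layer: y = g(W x + W0)."""
    return g(W @ x + w0)

def mlp_forward(x, weights, thresholds):
    """Feedforward pass: the output of each layer is the input of the next."""
    for W, w0 in zip(weights, thresholds):
        x = layer_forward(x, W, w0)
    return x

rng = np.random.default_rng(0)
sizes = [7, 5, 4, 1]  # illustrative neurons per layer (input, hidden, hidden, output)
weights = [rng.uniform(-0.5, 0.5, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
thresholds = [rng.uniform(-0.5, 0.5, size=n_out) for n_out in sizes[1:]]

x = rng.uniform(-0.9, 0.9, size=sizes[0])  # one hypothetical scaled input pattern
y = mlp_forward(x, weights, thresholds)
print(y)
```

With the hyperbolic tangent as activation function, the output always lies between -1 and 1, which is why the data need to be scaled before training.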


Backpropagation algorithm 


The backpropagation algorithm for training an MLP (multilayer
perceptron) architecture consists of five elementary steps, according to
Larrañaga, Inza, and Moujahid (1997):


Step 1. Randomly set the initial weights and thresholds (t := 0,
initial epoch).


Step 2. For each pattern in the training set: 


2.1 Execute a forward pass to obtain the network's response to the
pattern.


2.2 Calculate the total error in the output layer. 


2.3 Calculate the partial increase in weights and thresholds 
due to each training pattern. 


Step 3. Calculate the current total increment, extended to all 
patterns. The same procedure is carried out with the thresholds. 


Step 4. Update the weights and thresholds.


Step 5. The total error is determined, and if it is not acceptable,
all the patterns are presented to the network again. The algorithm is
repeated from Step 2 until satisfactory results are obtained (t := t + 1,
next epoch).
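The five steps can be sketched in Python as batch gradient descent for a small MLP with hyperbolic tangent activations. This is a minimal illustration under stated assumptions, not the VB.net implementation used in the study: the learning rate, stopping tolerance, initial weight range, network size, and synthetic data are all hypothetical.

```python
import numpy as np

def train_mlp(X, Y, sizes, lr=0.1, epochs=300, tol=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: random initial weights and thresholds.
    W = [rng.uniform(-0.5, 0.5, (o, i)) for i, o in zip(sizes[:-1], sizes[1:])]
    b = [rng.uniform(-0.5, 0.5, o) for o in sizes[1:]]
    history = []
    for epoch in range(epochs):
        dW = [np.zeros_like(w) for w in W]
        db = [np.zeros_like(v) for v in b]
        total_error = 0.0
        for x, y in zip(X, Y):                       # Step 2: each pattern
            acts = [x]                               # 2.1 forward pass
            for w, v in zip(W, b):
                acts.append(np.tanh(w @ acts[-1] + v))
            err = acts[-1] - y                       # 2.2 output error
            total_error += 0.5 * np.sum(err ** 2)
            delta = err * (1.0 - acts[-1] ** 2)      # 2.3 partial increments
            for k in reversed(range(len(W))):
                dW[k] += np.outer(delta, acts[k])    # Step 3: accumulate over patterns
                db[k] += delta
                if k > 0:                            # propagate the error backwards
                    delta = (W[k].T @ delta) * (1.0 - acts[k] ** 2)
        for k in range(len(W)):                      # Step 4: gradient descent update
            W[k] -= lr * dW[k] / len(X)
            b[k] -= lr * db[k] / len(X)
        history.append(total_error / len(X))         # error measured during this epoch
        if history[-1] < tol:                        # Step 5: stop if acceptable
            break
    return W, b, history

# Synthetic, already scaled data used only to exercise the sketch.
rng = np.random.default_rng(1)
X = rng.uniform(-0.9, 0.9, size=(60, 3))
Y = (0.5 * X[:, 0] - 0.3 * X[:, 1] * X[:, 2]).reshape(-1, 1)
W, b, history = train_mlp(X, Y, sizes=[3, 4, 1])
print(history[0], history[-1])
```

Accumulating the increments over all patterns before updating corresponds to Steps 3 and 4; updating after every pattern would be an alternative (online) variant.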


Blanco (2016) indicates that the backpropagation algorithm is
usually combined with a learning rule such as the delta rule or the
gradient descent method. The ANN used in this research was trained with
the latter.
The proposed training scheme is shown in Figure 3. The main
characteristics of the ANN used to estimate the historical record of
monthly flows for the period 1965-2017 are:


• ANN type: multilayer perceptron

• Training algorithm: backpropagation

• Combined algorithm: gradient descent

• Single activation function: hyperbolic tangent

• ANN structure: 7-5-4-1 neurons per layer


Figure 3. Trained multilayer perceptron artificial neural network.


This scheme was obtained through multiple trial-and-error runs and is
the one that gave the best results both in training and in generalization
to values not used in training.


Recurrent neural network 


Pérez-Ortiz (2002) explains in his doctoral thesis that the way the
neurons of an ANN are interconnected defines a directed graph. If the
graph is acyclic, the network is a feedforward ANN, the most common case,
which includes the multilayer perceptron-type ANNs seen above. If the
graph has cycles, the network is called a recurrent neural network. In
this type of network, the cycles have a profound impact on the learning
capacity of the network and make it more efficient for processing time
series.


A recurrent neural network (RNN) can be of several types. This
research uses a NAR-type RNN, also known as a nonlinear autoregressive
model. Its state combines the inputs and outputs of previous patterns: in
addition to the previous inputs, the network's previous outputs are fed
back, which makes it well suited to time series prediction.


In this research, an RNN NAR was trained to predict the synthetic
time series generated by the MLP ANN over a future period of 8 years. The
training was carried out in MATLAB with the “Time Series app” module,
which is opened by executing the 'nnstart' command on the command line.
The characteristics of the network (Figure 4) are as follows:


• ANN type: recurrent NAR

• Training algorithm: backpropagation

• Combined algorithm: Bayesian regularization

• ANN structure: 12 neurons

• Delay: 96 values


The delay of 96 values (eight years) has been established based on 
the number of years to be projected. 
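The role of the delays can be illustrated with a short Python sketch of the autoregressive set-up y(t) = f(y(t-1), ..., y(t-d)): lagged patterns are built for one-step (open-loop) training, and the forecast is then produced in closed loop by feeding each prediction back as an input. The one-step model here is a linear least-squares fit used purely as a stand-in for the trained network, and the flow series is synthetic; both are assumptions made only to keep the example self-contained.

```python
import numpy as np

def make_lagged(series, d):
    """Open-loop patterns: each row is [y(t-1), ..., y(t-d)], target is y(t)."""
    y = np.asarray(series, dtype=float)
    X = np.array([y[t - d:t][::-1] for t in range(d, len(y))])
    return X, y[d:]

def closed_loop_forecast(predict_one, history, d, steps):
    """Multi-step forecast: every prediction is fed back as the newest lag."""
    window = list(np.asarray(history, dtype=float)[-d:])  # chronological order
    out = []
    for _ in range(steps):
        lags = np.array(window[::-1])                     # most recent value first
        y_next = float(predict_one(lags))
        out.append(y_next)
        window = window[1:] + [y_next]
    return np.array(out)

# Synthetic monthly series with a 12-month cycle (53 years), for illustration only.
months = np.arange(636)
flows = 10.0 + 5.0 * np.sin(2 * np.pi * months / 12)

d = 96                                                    # delay of 96 values (eight years)
X_lag, y_next = make_lagged(flows, d)
# Stand-in one-step model: linear least squares instead of a trained NAR network.
coef = np.linalg.lstsq(X_lag, y_next, rcond=1e-8)[0]
forecast = closed_loop_forecast(lambda lags: lags @ coef, flows, d, steps=96)
print(forecast[:12].round(2))
```

In closed loop, any error in one predicted month propagates into later months, which is one reason a multi-year forecast can be expected to fit less closely than the one-step estimates.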
Figure 4. RNN NAR training in MATLAB 2015 (network settings shown: 12
hidden neurons, 96 delays; problem definition y(t) = f(y(t-1), ...,
y(t-d))).


With the network trained, the following lines of code are executed as
a MATLAB script to propagate the series, that is, to make the forecast
from what the RNN NAR has learned.
T = tonndata(CAUDAL,false,false);          % convert the flow series to the cell format used by the network
[x1,xio,aio,t] = preparets(net,{},{},T);   % prepare inputs, targets and initial delay states for this network
[y1,xfo,afo] = net(x1,xio,aio);            % run the open-loop network and keep its final states
[netc,xic,aic] = closeloop(net,xfo,afo);   % convert to a closed-loop network, carrying those states over
[y2,xfc,afc] = netc(cell(0,96),xic,aic);   % closed-loop forecast of 96 months (eight years)


Data processing 


Protocol for the implementation of ANN in precipitation-runoff models


Dawson and Wilby (2001) propose a protocol for the implementation of 
ANN in rain-runoff models, which consists of the following steps: 


1st Collect data.


2nd Select the prediction model.


3rd Data preprocessing, stage 1: eliminate jumps and trends, if
necessary, and remove seasonality. Select the variables to be predicted
and the predictor variables, choosing the most influential.


4th Choose a type of ANN: type of network and training algorithm.


5th Data preprocessing, stage 2: scale the data according to the
output range of the chosen activation function. For this step, Equation
(5) is used:

$$Z = \frac{(L_s - L_i)\,Y + (L_i\,M_z - L_s\,m_z)}{M_z - m_z} \tag{5}$$

Where:
Z = scaled series.
Mz, mz = maximum and minimum values of series Y, respectively.
Ls, Li = upper and lower limits to adopt, respectively.
Y = value to be scaled.


6th Train the ANN.


7th Validate the ANN.
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The pre-processed information in the first stage has been scaled. 
The parameters required to scale each of the variables towards the 
working range of the hyperbolic tangent function (-1 to 1) are shown in 
Table 1. The entire range of the function has not been used by the 
recommendation of the protocol as mentioned above, but the values have 
been scaled in such a way that there is a maximum of 0.9 and a minimum 
of -0.9 in each variable. 
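As a worked illustration of Equation (5), the following Python sketch scales a value to the range -0.9 to 0.9 and maps it back. The function names `scale` and `unscale` and the sample maximum and minimum are assumptions made for the example; they are not taken from Table 1.

```python
def scale(y: float, m_max: float, m_min: float,
          upper: float = 0.9, lower: float = -0.9) -> float:
    """Equation (5): map y from [m_min, m_max] to [lower, upper]."""
    return ((upper - lower) * y + (lower * m_max - upper * m_min)) / (m_max - m_min)


def unscale(z: float, m_max: float, m_min: float,
            upper: float = 0.9, lower: float = -0.9) -> float:
    """Inverse of Equation (5): recover the original value from a scaled one."""
    return (z * (m_max - m_min) - (lower * m_max - upper * m_min)) / (upper - lower)


# Hypothetical series limits, used only to exercise the functions.
series_max, series_min = 200.0, 0.0
z_low = scale(0.0, series_max, series_min)      # lower limit of the working range
z_mid = scale(100.0, series_max, series_min)    # midpoint of the series
z_high = scale(200.0, series_max, series_min)   # upper limit of the working range
back = unscale(z_mid, series_max, series_min)   # round trip to the original value
```

The inverse function is what allows network outputs, which are produced in the scaled range, to be read again in the original units.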


Table 1. Parameters to scale the variables to the working range of the
hyperbolic tangent activation function.

[Only part of the table is legible: S. Matara 430.20; 0.00; Puente
Crisnejas 205.60.]


Collection and processing of meteorological and hydrometric information


The meteorological stations are unevenly distributed within the basin and 
its surroundings. Those better spatially distributed in latitude, longitude 
and elevation, and that also have reliable records over long periods have 
been selected. The information has been compiled from the stations 
shown in Table 2. The stations are located as shown in Figure 6. 
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Table 2. Hydrometeorological stations. 
To execute the first stage of data preprocessing proposed by Dawson and
Wilby (2001), outliers were filtered using Tukey's box plots (Tukey,
1977). Jumps in the mean were verified and corrected using non-parametric
statistical tools such as the cumulative deviations test of Buishand
(1982), and jumps in the variance using the Fligner-Killeen test (Fligner
& Killeen, 1976). Trends were analyzed with the Mann-Kendall test
(Kendall, 1975). All of the above was carried out in R 3.4.0, with the
Trend and Climtrends packages. Missing data were filled with the HEC-4
model of the US Army Corps of Engineers (1971), which is based on multiple
regressions between each month of record and between stations. This first
stage followed the flow chart shown in Figure 5.
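Two of the non-parametric tools named above can be sketched in a few lines of plain Python. This is a stand-in for illustration, not the R packages used in the study: the function names, the 1.5 fence multiplier, and the sample data are assumptions, and the Mann-Kendall variance below omits the correction for tied values.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def tukey_filter(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Keep only the values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def mann_kendall(values: Sequence[float]) -> Tuple[int, float, float]:
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test.

    Assumes there are no tied values (no tie correction in the variance).
    """
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are required")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)          # sign of each pairwise difference
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal approximation
    return s, z, p_value


# Hypothetical data, used only to exercise the functions.
monthly_rain = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0, 12.0]
cleaned = tukey_filter(monthly_rain)              # drops the isolated high value
rising = [float(v) for v in range(1, 11)]
s_stat, z_stat, p_val = mann_kendall(rising)      # strictly increasing series
```

A small p-value suggests a monotonic trend that would need to be adjusted or removed before training, which is the decision represented at the end of the preprocessing flow.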


Tecnología y ciencias del agua, ISSN 2007-2422, 14(1), 124-199. DOI: 10.24850/j-tyca-14-01-04
2023, Instituto Mexicano de Tecnología del Agua. Open Access under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/)

[Figure 5 flowchart, labels translated: Start; Can the outliers be
removed?; Does the information fit a normal distribution? (Yes / No);
Normalization; Use parametric tests; Jump analysis; Jump correction;
Trend analysis; Adjustment or removal of the trend component.]

Figure 5. Hydrometeorological information pre-processing flow
diagram.
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Collection and processing of cartographic information 


The basin has been delimited using an ASTER-GDEM digital elevation 
model. In addition, 12 multispectral images were acquired from the 
Landsat program corresponding to each month of the hydrological year, 
as shown in Table 3. These images have been used to determine the NDVI 
using the Tucker equation (Huete & Tucker, 1991):


$$\mathrm{NDVI} = \frac{NIR - Red}{NIR + Red} \qquad (6)$$


Where:


NIR = band corresponding to the near-infrared.


Red = red spectrum band.
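Equation (6) is applied pixel by pixel. The sketch below, in plain Python, computes the NDVI for two small reflectance grids and averages only the pixels that have data, which is one way to handle areas hidden by clouds. The function names, the use of `None` for masked pixels, and the sample reflectances are assumptions for illustration; the study itself used QGIS for this processing.

```python
from typing import List, Optional

Band = List[List[Optional[float]]]  # reflectance grid; None marks a masked pixel


def ndvi_pixel(nir: Optional[float], red: Optional[float]) -> Optional[float]:
    """Equation (6) for one pixel; returns None when the pixel has no data."""
    if nir is None or red is None:
        return None
    total = nir + red
    if total == 0:
        return None                      # avoid division by zero
    return (nir - red) / total


def ndvi_image(nir_band: Band, red_band: Band) -> Band:
    """Apply Equation (6) to every pixel of two bands of equal shape."""
    return [
        [ndvi_pixel(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def mean_ndvi(ndvi: Band) -> float:
    """Average the NDVI over the pixels that have data."""
    valid = [v for row in ndvi for v in row if v is not None]
    if not valid:
        raise ValueError("no valid pixels to average")
    return sum(valid) / len(valid)


# Hypothetical 2x2 reflectance grids; one pixel is masked (for example, by cloud).
nir = [[0.5, 0.4], [None, 0.3]]
red = [[0.1, 0.2], [0.1, 0.1]]
monthly_mean = mean_ndvi(ndvi_image(nir, red))
```

Averaging only the valid pixels yields a single monthly value per image, which is the form of input a network trained on monthly patterns would need.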


Table 3. Landsat images used to determine NDVI.
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1992 Landsat 4-5 Less than 10 % 


1991 Landsat 4-5 Less than 10 % 


Prior to the calculation of the NDVI, each image was corrected and its
digital levels were transformed to physical parameters, following the
flow chart of Figure 7, adapted from Chuvieco (1996). The processing was
done in QGIS 2.18, using the Semi-Automatic Classification Plugin.
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Figure 7. Multispectral Imaging Flowchart. 




Study area 


The Crisnejas river basin (see Figure 8) is located in northern Peru, in
the departments of Cajamarca and La Libertad. The basin has been
delimited from the point located at the Crisnejas bridge (Table 4), where
a hydrometric station has recorded river levels for more than 30 years;
however, the stage-discharge curves needed to transform these levels into
flows are not available. There are only 13 years of daily flow
measurements.


Table 4. Location of the Puente Crisnejas hydrometric station. 


Location 


UTM-WGS 1984 
GCS WGS 1984 
Zone 17S 


[SEUS Mad) 


Puente Crisnejas | 818705 Moll 79 27' 48. seal 78% 6' 47. Elsa 1988 
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Figure 8. Crisnejas river basin. 
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Results 


Analysis of cartographic information 


In addition to delimiting the basin, the normalized difference vegetation
index (NDVI) has been determined for each month of a hydrological year
assumed as average, as shown in Table 5 and Figure 9. In some months,
cloud cover did not allow the NDVI to be obtained in some areas of the
basin; however, since the required numerical value is an average, the
missing information was not completed, and only the average of what was
captured in the image was obtained. Figure 10 shows the spatial
distribution of the NDVI.


Table 5. Monthly average NDVI for the training of the ANN MLP.


May


NDVI | 0.47 | 0.44 | 0.39 | 0.32 | 0.32 | 0.32 | 0.31 | 0.36 | 0.34
Figure 9. Normalized Difference Vegetation Index (NDVI), monthly
average.
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Figure 10. NDVI calculation. 
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Analysis of hydrometeorological information 


The processing of the hydrometeorological information yielded monthly
precipitation and temperature time series that are homogeneous in both
mean and variance and free of trends and outliers. In addition, the
records of all meteorological stations were standardized by extending the
short time series (Figure 13 and Figure 14).


In general, the hydrological cycle in the region shows a wet season from
September to March and a dry season beginning in April (Figure 11).
Temperatures are higher in the wet season and lower in the dry season,
except at the San Juan station, where the reverse occurs (see Figure 12).
Initially, discarding this station was considered. However, it was kept,
since its behavior could enrich the learning of the ANN MLP; if it does
not, the weights adjusted during training will rule out its influence.
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Figure 11. Average monthly precipitation in mm, for complete and 
extended records in 1965-2017. 
Figure 12. Average monthly temperature in °C, for records from
1965-2017.
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Figure 13. Total annual precipitation in mm, 1965-2017. 
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Figure 14. Total annual precipitation in mm, 1965-2017. 


The application of non-parametric statistics tools has allowed the 
time series analysis to be more reliable and consistent with the expected 
hydrological behavior of the studied region. 


The same procedure has been followed for the flow analysis of the 
Puente Crisnejas hydrometric station (Figure 15). 
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Figure 15. Homogenization of flows (m3 / s) at Puente Crisnejas 
station. 


Training of the multilayer perceptron (ANN MLP). 
Estimation of flows in historical record 1965 to 2017 


The ANN MLP training shows a close fit between the measured data and the
data reproduced by the network. Validation was carried out with the data
reserved from training in order to evaluate the predictive capacity of
the network for untrained patterns. As expected, the data from trained
patterns (Figure 16) present, in general, a better fit with the
measurements than the information generated from untrained patterns
(Figure 17). Even so, the latter shows a high goodness of fit according
to the measures or coefficients considered by Cabrera (2012).


Table 6. Goodness of fit of the flow rates estimated by the MLP-type ANN.

Goodness-of-fit measure¹ | Trained period (1968-1976, 2016) | Qualification | Untrained period (2014, 2015, 2017) | Qualification
Correlation coefficient (r) | … | Strong positive correlation | … | Strong positive correlation
Determination coefficient (r²) | … | … | … | …
Schultz coefficient (D) | … | Very good | 8.65 | Good
Cumulative mean deviation (MAD) | 6.35 | … | 18.92 | …
Nash-Sutcliffe efficiency (E) | 0.96 | Excellent | 0.77 | Very good
Mass balance error (m) in % | … | … | … | …
Root mean square error (RMSE) | … | … | … | …

¹ Cabrera (2012). … = value not legible.
Figure 16. Learning - ANN MLP for trained monthly patterns.

Figure 17. Validation - ANN MLP for untrained monthly patterns.


As seen in the slopes of the regression lines, Figure 18 indicates a
good fit between the measured information and that estimated with the
MLP ANN, both for trained and untrained patterns.
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Figure 18. Dispersion of monthly flows, ANN MLP. 


Regarding the cumulative mean deviation (MAD), it is important to clarify
that this parameter should be as close to 0 as possible, since it
represents the average of the differences between the observed and
estimated data. The values of 6.35 for the trained period and 18.92 for
the untrained period can be directly interpreted as the "average error"
in m³/s between the estimated and measured information in those periods.


The mass balance error (m) represents, in quantity, the relationship
between the volume of the observed hydrograph and that of the simulated
one. Likewise, the closer it is to 0, the better the evaluation. In this
case, there is less error in the data generated for the untrained period.


The root mean square error (RMSE) quantifies the magnitude of the
deviation between the measured and estimated values; similarly, a value
closer to 0 implies a better fit. Again, in this particular case the
trained period presents a better fit than the untrained period.
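The measures discussed above can be computed with a short Python sketch. The definitions used here are common textbook forms and are assumptions, not necessarily the exact expressions of Cabrera (2012): MAD is taken as the mean absolute difference, and the mass balance error as the percentage difference between simulated and observed totals. The Schultz coefficient is omitted, and the sample flows are hypothetical.

```python
import math
from typing import Dict, Sequence


def goodness_of_fit(observed: Sequence[float],
                    simulated: Sequence[float]) -> Dict[str, float]:
    """Return r, r2, Nash-Sutcliffe E, MAD, mass balance error (%) and RMSE."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    r = cov / math.sqrt(var_o * var_s)
    return {
        "r": r,                                    # Pearson correlation
        "r2": r * r,                               # coefficient of determination
        "nse": 1.0 - sq_err / var_o,               # Nash-Sutcliffe efficiency
        "mad": sum(abs(o - s) for o, s in zip(observed, simulated)) / n,
        "mass_balance_pct": 100.0 * (sum(simulated) - sum(observed)) / sum(observed),
        "rmse": math.sqrt(sq_err / n),
    }


# Hypothetical monthly flows in m3/s, used only to exercise the function.
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]
fit = goodness_of_fit(measured, estimated)
```

In this example the over- and under-estimates cancel, so the mass balance error is zero even though MAD and RMSE are not. This illustrates why a low mass balance error alone does not imply a good month-by-month fit.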


The record of monthly flows generated with the MLP ANN for the 
period between 1965-2017 is shown in Figure 19. 
Figure 19. Record of monthly flows (m³/s) estimated with the ANN
MLP, 1965-2017.


Recurrent network training (RNN NAR). Flow forecast
for the period 2018 to 2025


The monthly flow data estimated with the MLP ANN were used (in its 
scaled form) to train the RNN NAR, which outputs the projected monthly 
flow data until 2025, as shown in Figure 23. 
The forecast was analyzed for jumps and trends before being accepted as
valid, which meant discarding the response given by several trained
networks. Finally, the forecast that did not require corrections of this
type was selected.


It should be mentioned that, given the nature of the network and the
noise in the data, some negative values, which are physically
meaningless, are usually generated in the forecast. In this case, there
were six such values among the 96 months; they were purged and replaced
by the immediately higher positive value. To validate the forecast, the
network had to be trained many times until it learned the behavior of the
monthly flow time series almost perfectly, in order to reduce errors in
the forecast. The training results generated in MATLAB are shown in
Figure 20 and Figure 21. The correlation in the training period is almost
perfect, and the validation done by MATLAB indicates a moderate positive
correlation. Monthly data from the last two years were reserved to
determine the precision of the forecast; Table 7 shows the goodness of
fit.


[Figure 20 regression panels, Output versus Target. Training: R=1,
Output ≈ 1*Target + -2.2e-08. Test: R=0.60449, Output ≈ 0.46*Target +
-0.13. All: R=0.98045, Output ≈ 0.97*Target + -0.01.]

[Figure 21 plot: Response of Output Element 1 for Time-Series 1; targets
and outputs (Output and Target) versus Time.]

Figure 21. RNN NAR response for the trained time series. 


Table 7. Goodness of fit of the flows estimated by the NAR-type ANN:
comparison of measured and forecast periods, 2017-2019.

Goodness-of-fit measure | Flow forecast (2018-2025) | Qualification
Correlation coefficient (r) | … | …
Determination coefficient (r²) | … | …
Schultz (D) | .25 | …
Cumulative mean deviation (MAD) | 13.61 | …
Nash-Sutcliffe efficiency (E) | 0.64 | Very good
Mass balance error (m) | 33.55 | …
Root mean square error (RMSE) | 23.55 | …


In general, the data present an acceptable fit, considering that they
are forecasts and that their values can always be affected by variables
not controlled in the simulation (demand growth or climate change) and by
the training of the RNN NAR.


The forecast becomes less certain the further it is from the last
measured data, since the forecast error grows with each step of the
propagation, taking into account that each value generated depends on the
last 96 values. For this reason, forecasting a long period with this type
of technique is not advisable.


Forecast data are displayed in Figure 22, and the complete series is
shown in Figure 23.
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Figure 22. Forecast of monthly flows, period 2018-2025. 
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Figure 23. Monthly flow forecast at Puente Crisnejas station, extended 
until the year 2025. 
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The annual average of flows is shown in Figure 24 to observe a 
summary of the behavior predicted by the RNN NAR. 
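The annual summary can be obtained by averaging each block of twelve monthly values. The Python sketch below assumes that the series starts in the first month of a year and that all months are weighted equally (not by their number of days); the function name and the sample values are hypothetical.

```python
from typing import Dict, Sequence


def annual_means(monthly_flows: Sequence[float], start_year: int) -> Dict[int, float]:
    """Average consecutive blocks of 12 monthly values, one block per year."""
    if len(monthly_flows) % 12 != 0:
        raise ValueError("the series must contain whole years (multiples of 12)")
    years = len(monthly_flows) // 12
    return {
        start_year + i: sum(monthly_flows[12 * i:12 * i + 12]) / 12.0
        for i in range(years)
    }


# Hypothetical two-year series in m3/s, used only to exercise the function.
flows = [float(m) for m in range(1, 13)] + [2.0] * 12
summary = annual_means(flows, start_year=2018)
```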


Figure 24. Estimated record of the mean annual flow, period 1965-2025.
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Discussion 


The artificial neural networks used in this research have allowed the
estimation of the missing monthly flow record and the forecast of these
flows, resulting in a total synthetic series of 61 years of record
(1965-2025). This record provides a better overview of the river's water
supply for planning and developing future water use projects.


The complete record does not show a significant trend in the data;
however, the forecast series shows low flow values. This could be due to
errors in the measurements of the initial years with which the multilayer
perceptron was trained; even though the information was corrected for
jumps, there is a noticeable variation between the information measured
in 1968-1976 and that of the period 2014-2019. Unfortunately, this factor
cannot be controlled, given the lack of metadata for the hydrometric
station.


Despite the above, the results of this research demonstrate the
robustness of multilayer perceptron-type and recurrent artificial neural
networks (ANNs) in the generation of synthetic series of monthly flows
from meteorological information with a high goodness of fit. In turn, an
adaptable procedural basis is shown for its extrapolation to basins with
a similar record of information, and even to cases in which a finer
temporal resolution is required, such as daily. This result is compatible
with that of Lama and Sánchez (2020), who evaluated the effect of
decomposition techniques used with a recurrent neural network called long
short-term memory to increase the precision of the daily prediction of
the Chira river flow in northern Peru. Likewise, Lee, Lee, and Yoon
(2019) and Heras and Matovelle (2021) obtained prediction results that
showed good performance, with minimum mean square errors and high
correlation coefficients, supporting that ANN models are suitable for
evaluating complex hydrological and hydrogeological water systems.


The technique used has made it possible to use the largest amount 
of measured and available information on the basin without having to 
resort to preliminary simplifications in the variables (estimation of other 
variables using empirical equations) of the hydrological cycle and 
resulting in a complete record with a high goodness of fit. 


Using non-parametric statistical tools has made it possible to simplify
the analysis of the information. It has not been necessary to resort to
normalizations or other techniques that make the data valid for
traditional statistical tests. It is important to bear in mind that this
work dealt with a relatively large amount of data and that, in future
research or work that requires a finer temporal resolution,
pre-processing the information before training could become very complex
if this aspect is not considered.


Other research works on ANN for the generation of synthetic series of
monthly flows, such as that of Laqui (2010), show that a scheme based on
an MLP ANN with current and antecedent precipitation and
evapotranspiration inputs gives a better correlation between measured and
estimated values than one using only current precipitation and
evapotranspiration data, in that case. This is not necessarily decisive
in all basins, considering the delay of each one or other factors that
could influence the monthly hydrological behavior. As no other
conceptualization of the basin has been investigated here in terms of its
variables, the training has been improved by modifying the configuration
parameters of the network, such as the number of layers or neurons and
even the activation function, and an even higher correlation coefficient
has been obtained. Gomes-Villa-Trinidad (2016) applies neural networks to
forecast the flow of the following month using an MLP ANN. However, since
its objective differs from that of this research, the conceptualization
of the training patterns is also different. In that case, the flow of the
previous month and the rainfall and temperatures are used, achieving good
results in predicting the flow for the following month. However, as
previously said, that work seeks a forecast rather than the generation of
a synthetic series. The forecast in the present research was carried out
on the historical record and with another type of network architecture
(RNN NAR), since this synthetic record allows a useful long-term
visualization for decision-making.
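The difference between training patterns built from current inputs only and patterns that also include antecedent values can be made concrete with a small Python sketch. It illustrates the general idea of lagged inputs discussed above, not the scheme of any of the cited works or of this research; the function name, the single input variable, and the sample values are assumptions.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[List[float], float]  # (network inputs, target flow)


def build_patterns(precip: Sequence[float], flow: Sequence[float],
                   lags: int = 1) -> List[Pattern]:
    """Pair each monthly flow with current and antecedent precipitation."""
    if len(precip) != len(flow):
        raise ValueError("precipitation and flow must have the same length")
    patterns: List[Pattern] = []
    for t in range(lags, len(flow)):
        inputs = [precip[t - k] for k in range(lags + 1)]  # P(t), P(t-1), ...
        patterns.append((inputs, flow[t]))
    return patterns


# Hypothetical monthly values, used only to exercise the function.
rain = [1.0, 2.0, 3.0, 4.0]
discharge = [10.0, 20.0, 30.0, 40.0]
with_antecedent = build_patterns(rain, discharge, lags=1)
current_only = build_patterns(rain, discharge, lags=0)
```

With `lags=0` each pattern contains only the current month, while larger values add earlier months and shorten the usable record by the same number of patterns.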


The MLP ANN scheme trained in this research is a good starting point for
future research that requires the generation of synthetic series of
monthly flows. It is important to point out that, unlike other research,
no transformations between measured variables have been carried out here;
the variables measured in the field are those that trained the MLP ANN,
which shows the advantage of artificial neural networks in making use of
the greatest amount of information measured in the basin.


As seen in the results, neural networks and satellite information have
wide applications in estimating records and forecasting flow in the short
or long term. They are limited only by the researcher's understanding of
the process to be modeled and an adequate selection of variables. Herrera
et al. (2020), as in this research, propose models based on artificial
neural networks and satellite information for filling in missing data at
meteorological stations and for the spatial reconstruction of
precipitation and temperature variables in the Department of Valle del
Cauca, Colombia, reaching correlation coefficients of around 0.9.


Future research could also analyze the trained weights, determine 
the influence of field measurements at each station with respect to flow, 
and even try to interpret the behavior through regional equations. 
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Conclusions 


The generation of the historical and forecast series of flows through 
training of artificial neural networks has been satisfactory and with a high 
goodness of fit, which allows us to have a solid base in terms of decision- 
making in future projects of water use of the basin. 


This work shows the technique's robustness and high capacity for
adaptation and use of the information measured in the basin. A protocol
has been shown that is adaptable to basins with similar
hydrometeorological records, such as a large number of basins on the
Peruvian coast and highlands, which would otherwise have to resort to
precipitation-runoff models that do not always give acceptable results or
that require lengthy calibration processes or additional field
measurements of the parameters required by each model.


In addition, a scheme and configuration of ANN MLP and RNN NAR are
presented as a starting point for similar analyses.


The methodology used can be extrapolated to many cases, since the
techniques used for the analysis, correction, and processing of
meteorological data are characterized by their wide range of application
to different types of data: in this case, non-parametric statistical
techniques and artificial neural networks, for which there are multiple
free-to-use tools. In addition, these techniques give well-fitted results
without the need to make assumptions about the data, and they have not
required calibration processes.


Finally, the information provided by this research shows the 
feasibility of using artificial neural networks to estimate synthetic series 
of monthly flows, both in historical records and in forecasts. 
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